NSF PAR Search | NSF Public Access Repository

Fundamentals of Caching Layered Data Objects

Bari, Agrim; de Veciana, Gustavo; Kesidis, George (July 2025, 45th IEEE International Conference on Distributed Computing Systems)

The effective management of the vast amounts of data processed or required by modern cloud and edge computing systems remains a fundamental challenge. This paper focuses on cache management for applications where data objects can be stored in layered representations. In such representations, each additional data layer enhances the “quality” of the object’s version, albeit at the cost of increased memory usage. This layered approach is advantageous in various scenarios, including the delivery of zoomable maps, video coding, future virtual reality gaming, and layered neural network models, where additional data layers improve quality/inference accuracy. In systems where users or devices request different versions of a data object, layered representations provide the flexibility needed for caching policies to achieve improved hit rates, i.e., delivering the specific representations required by users. This paper investigates the performance of the Least Recently Used (LRU) caching policy in the context of layered representation for data, referred to as Layered LRU (LLRU). To this end, we develop an asymptotically accurate analytical model for LLRU. We analyze how LLRU’s performance is influenced by factors such as the number of layers, as well as the popularity and size of an object’s layers. For example, our results demonstrate that, in the case of LLRU, adding more layers does not always enhance performance. Instead, the effectiveness of LLRU depends intricately on the popularity distribution and size characteristics of the layers.
Free, publicly-accessible full text available July 20, 2026
Virtual Reality Benchmark for Edge Caching Systems

https://doi.org/10.1109/VRW66409.2025.00274

Alfares, Nader; Kesidis, George (March 2025, IEEE)

Free, publicly-accessible full text available March 8, 2026
Container Sizing for Microservices with Dynamic Workload by Online Optimization

https://doi.org/10.1145/3631311.3632399

Alfares, Nader; Kesidis, George (December 2023, ACM)

Full Text Available
Stash: A Comprehensive Stall-Centric Characterization of Public Cloud VMs for Distributed Deep Learning

https://doi.org/10.1109/ICDCS57875.2023.00023

Sharma, Aakash; Bhasi, Vivek M; Singh, Sonali; Jain, Rishabh; Gunasekaran, Jashwant Raj; Mitra, Subrata; Kandemir, Mahmut Taylan; Kesidis, George; Das, Chita R (July 2023, IEEE)

Deep neural networks (DNNs) are increasingly popular owing to their ability to solve complex problems such as image recognition, autonomous driving, and natural language processing. Their growing complexity coupled with the use of larger volumes of training data (to achieve acceptable accuracy) has warranted the use of GPUs and other accelerators. Such accelerators are typically expensive, with users having to pay a high upfront cost to acquire them. For infrequent use, users can, instead, leverage the public cloud to mitigate the high acquisition cost. However, with the wide diversity of hardware instances (particularly GPU instances) available in public cloud, it becomes challenging for a user to make an appropriate choice from a cost/performance standpoint. In this work, we try to address this problem by (i) introducing a comprehensive distributed deep learning (DDL) profiler Stash, which determines the various execution stalls that DDL suffers from, and (ii) using Stash to extensively characterize various public cloud GPU instances by running popular DNN models on them. Specifically, it estimates two types of communication stalls, namely, interconnect and network stalls, that play a dominant role in DDL execution time. Stash is implemented on top of prior work, DS-analyzer, that computes only the CPU and disk stalls. Using our detailed stall characterization, we list the advantages and shortcomings of public cloud GPU instances for users to help them make an informed decision(s). Our characterization results indicate that the more expensive GPU instances may not be the most performant for all DNN models and that AWS can sometimes sub-optimally allocate hardware interconnect resources. Specifically, the intra-machine interconnect can introduce communication overheads of up to 90% of DNN training time and the network-connected instances can suffer from up to 5× slowdown compared to training on a single instance. 
Furthermore, (iii) we also model the impact of DNN macroscopic features such as the number of layers and the number of gradients on communication stalls, and finally, (iv) we briefly discuss a cost comparison with existing work.
Full Text Available
Multi-resource fair allocation for consolidated flash-based caching systems

https://doi.org/10.1145/3528535.3565245

Choi, Wonil; Urgaonkar, Bhuvan; Kandemir, Mahmut Taylan; Kesidis, George (November 2022, ACM/IFIP Middleware)

Full Text Available
Stash: A comprehensive stall-centric characterization of public cloud VMs for distributed deep learning

Sharma, Aakash; Bhasi, Vivek; Singh, Sonali; Jain, Rishabh; Raj, Jashwant; Mitra, Subrata; Kandemir, Mahmut Taylan; Kesidis, George; Das, Chita (January 2023, Proceedings of the International Conference on Distributed Computing Systems)

Full Text Available
SHOWAR: Right-Sizing And Efficient Scheduling of Microservices

https://doi.org/10.1145/3472883.3486999

Baarzi, Ataollah Fatahi; Kesidis, George (November 2021, ACM SoCC)

Full Text Available
On Merits and Viability of Multi-Cloud Serverless

https://doi.org/10.1145/3472883.3487002

Baarzi, Ataollah Fatahi; Kesidis, George; Joe-Wong, Carlee; Shahrad, Mohammad (November 2021, SoCC '21: Proceedings of the ACM Symposium on Cloud Computing)

Full Text Available
CASH: A Credit Aware Scheduling for Public Cloud Platforms

https://doi.org/10.1109/CCGrid51090.2021.00032

Sharma, Aakash; Dhakshinamurthy, Saravanan; Kesidis, George; Das, Chita R. (August 2021, 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid))

Full Text Available
On a Caching System with Object Sharing

https://doi.org/10.1145/3429881.3430107

Alfares, Nader; Kesidis, George; Li, Xi; Urgaonkar, Bhuvan; Kandemir, Mahmut; Konstantopoulos, Takis (December 2020, Workshop on Middleware and Applications for the Internet of Things)
Full Text Available
